Self-organization via Active Exploration 
in Robotic Applications 
Phase II: Hybrid Hardware Prototype 

Final Progress Report 

Haluk Ogmen, Principal Investigator 
Ramkrishna V. Prakash, Research Assistant 
Department of Electrical Engineering 
University of Houston 
Houston, TX 77204-4793 
December, 1993 



Contents 


1 Introduction 

2 The Robot 

2.1 The Vision System 

2.2 The Arm System 

2.3 FRONTAL 

2.4 Communication Protocol: Sockets 


3 Simulations 

3.1 The novelty detection network 

3.2 Reinforcement versus novelty 

3.3 The delay neuron 

3.4 Variable criterion categorization 

3.5 Spatial novelty and attentive scanning in FRONTAL 

4 User-Interface 

5 Limitations 

6 Conclusion and future work 
A Appendix A 

A.1 Reinforcement-novelty detection network 

A.2 Reinforcement based classification 

A.3 The combined FRONTAL network 

B Appendix B 

C Appendix C 





1 Introduction 


In many environments, human-like intelligent behavior is required from robots to assist and/or 
replace human operators. The purpose of these robots is to reduce human time and effort in 
various tasks. Thus, the robot should be robust and as autonomous as possible in order to 
eliminate, or keep to a strict minimum, its maintenance and external control. If the robot 
requires more human intervention than the task it accomplishes, then it is useless for most 
applications. Moreover, if the robot has to function in an uncontrolled environment where 
unpredictable changes can occur, and if its maintenance is kept to a strict minimum, then the 
design requirements become more complex. In particular, direct program control or traditional 
model-based approaches to robotic problems prove inadequate because they cannot cope 
with such uncontrolled environments. What, then, are the key issues of the design problem? The 
analysis of the requirements outlined above leads to the following properties: 

(i) Fault tolerance: This property can be achieved by use of an adequately organized 
distributed architecture incorporating some redundancy. Fault tolerance lets the robot 
maintain an acceptable performance immediately after the occurrence of faults in the hardware or 
changes in its structural parameters (e.g. a change in an arm joint parameter due to mechanical 
fatigue). 

(ii) Self-organization (which augments fault tolerance by completely correcting the 
performance) detects and analyzes faults or external changes and consequently achieves the correct 
performance under these new conditions. 

(iii) Intelligence is necessary to achieve the understanding required by the self-organization 
process and also to analyze the environment and to predict future events. Moreover, intelligence 
is also necessary to establish a natural communication (e.g. language) between humans and the 
robot. 

But how can these properties be implemented in a robot? These properties are drastically 
different from the ones widely used in traditional design and require a careful analysis of the 
underlying phenomena. A good insight can be gained by considering principles found in studies 
directed toward “systems” that possess all these qualities: humans. Unlike many primitive 
animals, which are almost completely genetically wired, human infants undergo an extensive 
developmental period during which they learn to control and coordinate various parts of their 
body. Moreover, they actively explore the environment to transform simple instincts into habits 
and operational structures, using novelty and complex associations which result from the 
interaction with the environment. It is important to emphasize how this exploratory activity 
is fault tolerant and self-organizing: the growing child's physical characteristics continually 
change (the arms become longer, etc.). If the control were based on a strict model, it would 
fail to function as soon as the child grows a little, because none of the parameters would be 
the same. A large number of studies outline various principles regulating this 
developmental stage as well as its relationship with adult performance (e.g. Piaget 1963, 
1967, 1969, 1970). These classical studies show how self-organization and intelligence emerge 
from active exploration. They demonstrate subtle issues underlying the transformation of instincts 
into habits and operational structures. The exploratory activity requires a careful combination 
of internal drives and environmental cues. Until recently, these findings were limited to the realm 
of psychology. However, neural network theory has developed tools that enable us to apply these 
findings to technological problems. In Phase I of this project, we developed such a neural network 
architecture. It captures some fundamental aspects of human categorization, habit, novelty, and 
reinforcement behavior. The model, called FRONTAL (in reference to the frontal lobes), is a 
“cognitive unit” regulating the exploratory behavior of the robot. 

In the second phase of the project, we interfaced FRONTAL with an off-the-shelf robotic arm 
and a real-time vision system. The components of this robotic system, a review of FRONTAL, 
and simulation studies are presented in this report. 

2 The Robot 

The self-organizing robotic system is shown in Figure 1. It comprises the following four parts: 

• the vision system 

• the arm system 

• the neural network (FRONTAL) and 

• the communication protocols. 

The vision system enables the robot to see its surroundings, while the arm system allows it 
to interact with the objects present in its field of view. The neural network FRONTAL, which 
is the “cognitive controller” of the robot, enables it to actively explore its surroundings and to 
adapt its behavior to changes in the environment. The vision system and the arm system of the 
robot communicate with FRONTAL via communication protocols. During the initial stage of 
the development of this robotic system, a simple communication protocol using DARPA Internet 
protocol suite (TCP/IP) sockets was used. In the later stage, this communication protocol 
was replaced by a more versatile protocol developed using the Telerobotics Interconnection Protocol 
(TELRIP). 

Figure 1: The self-organizing robotic system. It comprises the following four systems: (i) 
the vision system, (ii) the arm system, (iii) the neural network based “cognitive controller” called 
FRONTAL and (iv) the communication protocols. The vision system consists of a camera 
which, in conjunction with the MaxVideo image processing system and object detection software, 
yields a real-time image processing system capable of detecting objects in the robot's 
environment. The arm system consists of a PUMA 562 arm and associated software to calculate its 
inverse kinematics. The neural network based “cognitive controller”, FRONTAL, is 
responsible for generating and coordinating purposeful behaviors for the robot. The various 
components of the robotic system communicate with each other via communication protocols 
developed using TCP/IP sockets. These were later replaced by a unified protocol developed 
using TELRIP. 

The vision system comprises a real-time image processing system called the MaxVideo 20, 
manufactured by DATACUBE, a grey-scale camera and object recognition software called 
BLOBS. The MaxVideo 20 system is mounted in a VME cage and communicates with a 
Sun Sparc II via a VME bus. The vision system was programmed to threshold the input from 
the camera so as to isolate objects from their background, thus accomplishing figure-ground 
segregation. This thresholded image constitutes the input to the robotic system. To facilitate 
the simultaneous viewing of the robot's environment and its inputs, the image processing system 
was programmed to toggle between two modes every other clock cycle. In the first mode, the raw 
image is captured and sent directly to the video monitor. In the second mode, the raw image 
is thresholded and sent simultaneously to the video monitor and to the object recognition 
software BLOBS. The thresholded and the non-thresholded frames were displayed together 
on a video monitor by splitting the screen into two parts. Thus, one could monitor 
the input to and the output from the vision system “simultaneously”. Because only every other 
frame is thresholded, the environment is sampled at a reduced rate, but this rate was still much 
faster than any dynamic changes that were induced in the robot's environment. 

The filtered images were then processed continuously by software running on a Sun Sparc 
II system, which generated a symbolic representation of each object's features and its location. 
The details of the vision system and an evaluation of its performance are presented in 
Subsection 2.1 and Section 5. 

The robotic arm system¹ consists of two PUMA 562 robot arms. A three-digit 
Stanford/JPL dexterous hand is attached to the right arm. The left arm has a two-digit gripper. 
In our current implementation, only the left arm with the gripper is used to interact with the 
environment. The PUMA 562 arms are controlled by a Unix workstation which 
communicates with the arm controller via a VME bus. The controller for the robotic arms is built 
by Cybernetics Inc. It allows one to control as well as monitor every joint angle of each arm. 
Moreover, torque sensors positioned at various joints yield a measure of the force exerted at the 
joints. Subroutines have been written to facilitate easy control of this robotic arm system, 
details of which will be presented later. 

So far we have discussed the sensory and the motor systems of this anthropomorphic robotic 
system. Hence a brief discussion of the “cognitive controller" of this robot seems warranted. A 
neural network called FRONTAL (Ogmen and Prakash, 1991) controls the vision system and 
the arm system. This neural network is capable of identifying and selectively attending to novel 
as well as rewarding objects in its environment. At the same time, it is capable of actively 
reorganizing its behavior depending on the external reinforcement signals. This neural network 
is implemented on an Amdahl supercomputer and it communicates with the vision system as 
well as the arm system using communication protocols. 

¹ This robotic system, called the Dexterous Anthropomorphic Robotic Testbed (DART), is being developed by the 
Robotics and Automation Division of NASA JSC. 





In the following sections, more detailed descriptions of the vision system, the arm system, 
FRONTAL and the communication protocol are presented. 

2.1 The Vision System 

The primary component of the vision setup is a real-time image processing system called 
the DataCube MaxVideo System 20 (MaxVideo 20). The MaxVideo system comprises various 
specialized hardware modules (called the MaxVideo modules) which are connected to each other 
via a MAXbus. This image processing system can be programmed by a host computer using 
object-oriented software called ImageFlow, which is resident on the host computer. The 
communication between the image processing system and the host computer is via a VME bus, 
and thus any computer system capable of VME bus based I/O (input/output) communication 
can be used to control this image processing system. In the current configuration, the MaxVideo 
system is interfaced with a Sun Sparc II system. Figure 2 shows an overview of the setup of 
the vision system. The MaxVideo modules along with the MAXbus provide a 10 MHz (103 
nsec/pixel) synchronous pipelined DSP (digital signal processing) engine which is capable of 
acquiring, processing and displaying images at a rate of 30 frames/sec². This system is capable of 
acquiring images in any one of the following input data formats: (i) 8/12 bit analog 
RS-170/CCIR (standard television), (ii) 8/12 bit asynchronous analog, (iii) 8/16 bit digital, (iv) 
24-bit NTSC (video), RGB, YIQ or (v) 36-bit RGB RS-170/CCIR format. It can process these 
raw images with 8 or 16 bit precision³ and store them with either 8, 16, 24 or 64 bit precision. 
The processed images can be displayed in one of the following data formats: (i) 
8-bit RS-170/CCIR B/W or pseudocolor, (ii) 8-bit High Resolution B/W or pseudocolor, (iii) 
24-bit NTSC or RGB or YIQ or (iv) 24-bit High Resolution RGB. The MaxVideo system also 
provides means to add an 8-bit graphics overlay image to the processed image being 
displayed. 

The MaxVideo system consists of the following five modules: 

• Analog Scanner (AS) 

• Architectural Adapter (AA) 

• Analog Generator (AG) 

• Advanced Pipeline Processor (AP) 

• Arithmetic Unit (AU) 

² This speed is for a standard 512 x 484 pixel image. The MaxVideo system is capable of processing high 
resolution images (4096 x 4096 pixels) but at a slower rate. However, the displayable resolution of the system is 
only 1024 x 1024 pixels. 

³ The AU MaxVideo module is however capable of processing with 20 bit precision. 

Figure 2: A block diagram of the various components of the vision system. 
The MaxVideo 20 system board is placed in a VME cage which is accessible to the Sun Sparc 2 
computer. The ImageFlow software, which is object-oriented control software for the MaxVideo 
system, is resident in the Sun Sparc 2 computer. Each frame processed by the MaxVideo system 
is displayed on a Sony monitor and simultaneously grabbed by the Sun computer to perform 
object recognition. The Sun Sparc 2 is in communication with the Amdahl via a sockets based 
communication protocol. 

A brief overview of the various modules and their capabilities is presented in Appendix B. 
In the current implementation of the vision system, only the AP module is used in conjunction 
with the AS, AA and AG modules. Figure 3 shows the overall setup of the MaxVideo 20 system 
for thresholding images in real time. As can be seen from the figure, the MaxVideo system is 
configured in two different modes (pathways) called PATs. The AS module receives a multisync 
signal from the CCD camera and routes it alternately through these two PATs. The first PAT 
goes directly from the AS module through the AA module and the AG module to the monitor. 
When the MaxVideo system runs in this configuration, it displays the captured raw image directly 
on the monitor. The second PAT goes from the AS through the AA, via the AP, and finally through 
the AG to the monitor. The AP module is configured to threshold the raw image by using a 
generic look-up table. In this configuration, the MaxVideo system thresholds the image that 
is captured by the CCD camera. The two paths are toggled every other clock cycle and they 
are alternately displayed on an external monitor. While the MaxVideo system is in the second 
mode, the thresholded image is read into the host computer. The thresholded image is then processed 
in software to locate the different objects in the image. Once the objects have been detected, 
their location and type are identified and the information is transmitted to FRONTAL over the 
communication protocol. 
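
Thresholding through a look-up table, as the AP module is described as doing, can be sketched in software as follows. This is only an illustration of the idea, not the ImageFlow configuration used on the MaxVideo hardware: the function names, the 8-bit pixel range and the threshold value of 128 are assumptions made for the example.

```python
# Minimal sketch of look-up-table thresholding of an 8-bit grey-scale image.
# All names and the threshold value are hypothetical, chosen for illustration.

def build_threshold_lut(threshold, low=0, high=255):
    """Return a 256-entry table mapping each grey level to `low` or `high`."""
    return [high if level >= threshold else low for level in range(256)]


def apply_lut(image, lut):
    """Replace every pixel of `image` (a list of rows) by its table entry."""
    return [[lut[pixel] for pixel in row] for row in image]


if __name__ == "__main__":
    lut = build_threshold_lut(128)
    raw = [[10, 200, 90],
           [130, 127, 255]]
    for row in apply_lut(raw, lut):
        print(row)
```

Because the table is computed once and then only indexed, the per-pixel cost is a single memory read, which is what makes the approach attractive for a pipelined processor.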

The software used for object detection consists of a blob detection algorithm called BLOBS⁴. 
BLOBS groups neighboring pixels of similar color as belonging to a single blob (or object). It 
also labels pixels of a blob that have only 3 neighbors as the edge pixels of the object. The area 
and the perimeter of an object are estimated from its total number of pixels and its number 
of edge pixels, respectively. Three different kinds of objects (equilateral triangle, 
circle, star) were required to be detected by BLOBS. A compactness measure given by 

$$\text{compactness} = \frac{\text{perimeter}^2}{\text{area}} \tag{1}$$

is used to differentiate among the three kinds of objects. An advantage of this measure 
is that it is rotation invariant. The image grabbed by the CCD camera, however, is 
distorted because of the aspect ratio of the pixels of the camera (the ratio of the sides of a pixel, 
which is equal to 0.75). This leads to an inconsistent measure of the area and the perimeter of the 
objects as they are rotated about an axis perpendicular to the camera. To avoid this problem, 
BLOBS scales each pixel to take care of the aspect ratio. Nine different types of objects, 
consisting of three different sizes (small, medium, large) and three shapes (circle, equilateral triangle, 
star), which were later used for the simulations, could be reliably detected by the vision system. 
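
As a worked illustration of the compactness measure (perimeter squared divided by area), the sketch below evaluates it for ideal geometric shapes. It is not part of BLOBS, which works from pixel counts; the helper names are hypothetical. The example shows that the measure does not depend on object size, which is why size and shape can be separated.

```python
import math

# Sketch: compactness = perimeter**2 / area for ideal shapes.
# Helper names are hypothetical and used only for this illustration.

def compactness(perimeter, area):
    """Return perimeter squared over area; area must be positive."""
    if area <= 0:
        raise ValueError("area must be positive")
    return perimeter ** 2 / area


def circle(radius):
    """Return (perimeter, area) of a circle."""
    return 2 * math.pi * radius, math.pi * radius ** 2


def equilateral_triangle(side):
    """Return (perimeter, area) of an equilateral triangle."""
    return 3 * side, math.sqrt(3) / 4 * side ** 2


if __name__ == "__main__":
    for size in (1.0, 2.0, 5.0):
        print(size,
              round(compactness(*circle(size)), 3),                # 4*pi, about 12.566
              round(compactness(*equilateral_triangle(size)), 3))  # 12*sqrt(3), about 20.785
```

The circle gives the smallest possible value, 4π, and the equilateral triangle gives 12√3, for every size. A star, with a long perimeter relative to its area, gives a larger value still.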


⁴ This software was developed in conjunction with Jeff Kowing and Bob Goode of NASA JSC. 




Figure 3: The two modes in which the MaxVideo system is operated. Panel (A) shows the 
first mode, in which the MaxVideo 20 system is configured to send the raw image captured by the CCD 
camera directly to the monitor. In the second mode, the raw image is first thresholded by the AP 
module and then sent to be displayed on the monitor. Simultaneously, the thresholded image is 
also sent to BLOBS, the object recognition software. The MaxVideo 20 system is toggled 
between the two modes every other system clock cycle. As the system clock rates are about 1000 
times faster than the dynamic changes in the environment, the processing of only every other image 
does not affect the performance of the robotic system. 









Figure 4: Two plots summarizing the performance of the vision system. The top graph 
plots the compactness ratio of the nine different objects against their area. Twenty objects of 
each of the nine kinds were presented to the vision system. The circles represent the average 
value of the compactness ratio for each of the objects and the bars represent the range. The 
objects could be grouped primarily into three types depending on their compactness ratio, as 
indicated by the horizontal lines. The second graph is a similar plot showing the range of the 
area of the nine objects. As can be seen, for a given object type (compactness ratio) the object 
sizes do not overlap. 








Figure 5: The various locations at which the objects used for studying the 
performance of the vision system were placed. 


Figure 4 gives a summary of the performance of the vision system. The vision system was tested 
twenty times for each type of object. The average compactness ratios, as well as their ranges, for 
the three types of objects are shown at the top of Figure 4. As can be seen, there is no overlap 
between the three types of objects. The bottom graph in Figure 4 plots the mean area and its 
range for all nine objects. For any given type of object, there was no overlap between the 
sizes. Figure 5 shows a scatter plot of the various locations in the visual field at which the objects 
were placed for testing the vision system's performance. 

2.2 The Arm System 

DART (Dexterous Anthropomorphic Robotic Testbed) is a robot developed by the Automation and 
Robotics Division at NASA JSC. This robot, shown in Figure 6, is built with an anthropomorphic 
design in mind. It consists of two PUMA 562 arms, a Stanford/JPL dexterous hand on the 
right arm and a gripper on the left arm. The two PUMA arms rest on a base that is controlled by 
a motor to enable the robot to rotate around its central axis (shown by x-x' in Figure 6). 

Figure 6: The DART system. It comprises two PUMA 562 arms and a vision system. The 
whole assembly rests on a base which can be rotated about the central x-x' axis by a motor. 
Each of the PUMA 562 arms has six joints, yielding a total of six degrees of freedom for each 
arm. The right hand consists of a Stanford/JPL three-digit hand. Each of the digits consists 
of three joints which can be controlled independently. The left hand of the DART comprises 
a two-digit gripper which is controlled by a single motor. The Tadpole Unix workstation is used 
to control the two arms and their hands. 

Each of the two PUMA 562 arms has 6 degrees of freedom, as shown in Figure 6. A Cybernetics 
servo controller consisting of three Central Processing Units (CPUs) controls the joints of each 
arm. The joints are controlled using Proportional-Derivative (PD) based servo 
loops. Position, velocity and torque control of the arm can be achieved via the controller. The 
controller is also capable of applying the brakes at the joints of the PUMA 562 arm. Commands to 
the controller can be issued by writing to a shared memory location that is read by the three 
CPUs of the controller. The various states of the arm, such as the joint angles and the torque at each joint, 
are written by the CPUs to shared memory locations which the computer can read 
or write to. The internal states of the arm are updated every millisecond. This fast update rate 
makes near real-time feedback control of the arm possible. The inverse and forward kinematic 
routines that were used for controlling the arm are based on the solutions for PUMA 562 arms 
available in standard robotics textbooks (Craig, 1989)⁵. 
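
The PD-based servo loop and the one-millisecond update period described above can be illustrated with a small simulation. This is a generic sketch of PD control on a single idealized joint, not the Cybernetics controller's firmware: the unit inertia, the gains and the function name are assumptions chosen so that the loop is critically damped.

```python
# Sketch of a PD servo loop on one idealized joint (pure inertia, no friction).
# Gains, inertia and names are hypothetical; only the 1 ms step follows the text.

def simulate_pd_joint(target, kp=100.0, kd=20.0, inertia=1.0,
                      dt=0.001, duration=2.0):
    """Drive a joint from rest at angle 0 toward `target`; return the final angle."""
    position = 0.0
    velocity = 0.0
    for _ in range(int(round(duration / dt))):
        # The proportional term pulls toward the target; the derivative term damps motion.
        torque = kp * (target - position) - kd * velocity
        velocity += dt * torque / inertia
        position += dt * velocity
    return position


if __name__ == "__main__":
    print(simulate_pd_joint(1.0))
```

With these gains the joint settles on the target without overshoot in well under two seconds. A lower derivative gain would make it oscillate around the target before settling.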

The Stanford/JPL three-digit right hand of the robot comprises two fingers and a thumb. 
The hand is controlled by a set of 12 servo-motors via a set of steel cables. Strain gauges located 
at the base of each finger provide tension feedback, which gives information about the applied 
force. Position feedback for the three digits is obtained by reading out the value of the encoder 
for each of the joints. The left hand comprises a two-digit gripper whose digits work in unison, 
as they are controlled by a single motor. The gripper motor is controlled independently of the 
motors of the arm joints⁶. Currently only the left arm and hand of the robot are used. In the 
future, we would like to use the three-digit right hand to perform dexterous tasks. 

When the robotic system wants to pick an object from its environment, it sends the location 
of the object, via the communication protocol, to the computer controlling the PUMA arm. 
On receiving the spatial location of the object, the computer computes the inverse kinematics 
for the PUMA arm. A trajectory for the arm motion is generated via joint interpolation. This 
interpolated set of joint-space points is written into the shared memory of the arm controller 
by the computer. When the arm reaches the required location, the gripper motor is activated to grab the 
object. After grabbing the object, the robot arm is commanded to return to its default 
position and release the picked object into a bin. The performance of the arm is presented in 
Section 5. 
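
Trajectory generation by joint interpolation can be sketched as below. The sketch assumes simple linear interpolation between a start and a goal joint vector; the report does not state which interpolation scheme was used, and the function name and the example joint angles are hypothetical. The inverse kinematics step that produces the goal vector is not shown.

```python
# Sketch of joint-space interpolation for a six-joint arm.
# Linear interpolation is an assumption; names and angles are illustrative only.

def interpolate_joints(start, goal, steps):
    """Return `steps` joint vectors evenly spaced from `start` to `goal`, inclusive."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    if len(start) != len(goal):
        raise ValueError("start and goal must have the same number of joints")
    return [
        [s + (g - s) * k / (steps - 1) for s, g in zip(start, goal)]
        for k in range(steps)
    ]


if __name__ == "__main__":
    home = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]        # default pose (degrees)
    reach = [90.0, -45.0, 30.0, 0.0, 60.0, 10.0]  # pose from inverse kinematics
    for point in interpolate_joints(home, reach, 5):
        print(point)
```

Each vector in the returned list corresponds to one set of joint-space points that would be written to the controller in turn.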

2.3 FRONTAL 

The neural network called FRONTAL that controls the robot is shown in Figure 7. FRONTAL 
comprises the following four parts: 

• spatial novelty network, 

• attentive scanning network, 

• object novelty network and 

• behavioral categorization network. 

⁵ These routines were developed and tested by Mr. Larry Li. 

⁶ Details regarding the gripper operation are given in Appendix C. 

The spatial novelty network (shown in the bottom left-hand corner of Figure 7) comprises 
an array of gated dipoles which are interconnected via a winner-take-all layer of neurons. This 
network enables the robot to detect a new object that enters, as well as an existing object that 
leaves, its field of view. The working of the spatial novelty circuit can be better understood 
by studying how a single gated dipole functions. A gated dipole network is shown in Figure 8. 
It comprises two parallel channels, called the “ON” and the “OFF” channels respectively, 
which inhibit each other. Both channels receive a common arousal signal “I”, while the external 
input signal “J” is applied only to the “ON” channel. The input signals to these channels are 
conveyed by depletable transmitters (marked by the squares). The “ON” channel activity provides 
a measure of the novelty of the applied external signal. Because the “ON” channel inhibits the “OFF” 
channel, the removal of the external signal “J” yields a transient reduction in this inhibition until the 
“ON” channel transmitter is replenished. This transient reduction in inhibition on the “OFF” 
channel yields a concomitant transient increase in the activity of the “OFF” channel. Together, the 
“ON” and “OFF” channel activities provide a measure of the novelty of an applied 
external input signal “J” and a signal indicating its removal. 

Figure 9 illustrates a neural network 
consisting of an array of interconnected gated dipoles capable of encoding novelty. The 
winner-take-all layer neurons of this novelty detection network also get input from the reward and punish 
neurons that encode the external reinforcement signals, thus enabling the circuit to weigh these 
signals against the novelty of the input. Simulations of the spatial novelty network alone and 
in combination with the reinforcement signals are presented later. Variations of this network 
are used in FRONTAL for detection of spatial novelty and object novelty. An array of these 
gated dipole networks, which constitutes the spatial novelty circuit, enables the robot to detect 
the introduction of a new object as well as the removal of an old object from its surroundings. Each 
of the gated dipoles corresponds to a unique spatial location in the field of view of the robot. The 
neurons in the winner-take-all layer receive inputs from both the “ON” and “OFF” channels 
of their respective gated dipoles. 
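
The behavior of a single gated dipole can be illustrated with a short simulation. The sketch below is a minimal textbook-style model written for this illustration and is not the set of equations used in FRONTAL: each channel's transmitter z recovers toward 1 at rate `recover` and is depleted at a rate proportional to the channel's input, and each channel's output is the rectified difference of the gated signals. All parameter values and names are assumptions.

```python
# Sketch of a gated dipole: two channels with depletable transmitters and
# mutual inhibition. Parameters and names are hypothetical illustration values.

def simulate_gated_dipole(t_on=10.0, t_off=30.0, t_end=50.0, dt=0.01,
                          arousal=1.0, j_amp=1.0, recover=0.1, deplete=0.5):
    """Euler-integrate the dipole; return (times, on_output, off_output) lists."""
    z_on = z_off = 1.0                      # transmitter levels, initially full
    on_step = int(round(t_on / dt))         # step at which input J is applied
    off_step = int(round(t_off / dt))       # step at which input J is removed
    times, on_out, off_out = [], [], []
    for k in range(int(round(t_end / dt))):
        j = j_amp if on_step <= k < off_step else 0.0
        s_on = arousal + j                  # ON channel receives I + J
        s_off = arousal                     # OFF channel receives I only
        gated_on = s_on * z_on              # signal after the depletable synapse
        gated_off = s_off * z_off
        times.append(k * dt)
        on_out.append(max(0.0, gated_on - gated_off))    # OFF inhibits ON
        off_out.append(max(0.0, gated_off - gated_on))   # ON inhibits OFF
        z_on += dt * (recover * (1.0 - z_on) - deplete * s_on * z_on)
        z_off += dt * (recover * (1.0 - z_off) - deplete * s_off * z_off)
    return times, on_out, off_out


if __name__ == "__main__":
    times, on_out, off_out = simulate_gated_dipole()
    for k in (999, 1000, 2999, 3000, 4999):
        print(f"t={times[k]:6.2f}  ON={on_out[k]:.4f}  OFF={off_out[k]:.4f}")
```

In this sketch the “ON” output jumps when “J” is applied and then decays as the “ON” transmitter depletes, which is the novelty measure. When “J” is removed, the depleted “ON” channel is briefly weaker than the “OFF” channel, producing the transient “OFF” rebound that signals removal of the input.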

The attentional scanning network, shown in the upper left corner of Figure 7, enables the 
robot to scan all the objects present in its environment. This network comprises arousal and 
inhibitory feedback neurons. They play a role in temporarily disengaging the attention of the 
robot from the current object. This in turn allows the robot to shift its attention to another 
object in its surroundings. The duration of this disengagement is controlled by delay neurons. 
The arousal neuron receives an inhibitory signal from the categorization network, which ensures 
that the attention of the robot is not disengaged during categorization of the object. 

While the robot attends to a particular object, the object novelty network, which is at the 
far right of Figure 7, categorizes the object into one of several types and ascertains whether that type is 
novel. The behavioral categorization network, which is in the center of Figure 7, categorizes the 
object according to its behavioral significance (i.e. good objects are those associated with positive 
reinforcements and bad objects are those associated with negative reinforcements). 

Figure 7: The “cognitive controller unit”: Visual inputs, shown at the bottom left of the figure, 
are processed by the spatial novelty and attentive scanning networks. The latter determines the 
spatial focus of attention. The features of the object present in that spatial location are sent 
to the “behavioral” and “object-type” categorization networks. The behavioral categories consist 
of “good” and “bad”. The outputs of the object-type categorization network are fed to the object 
novelty network (the gated dipoles at the right of the figure). These gated dipoles are connected to 
a winner-take-all network which also receives inputs from the behavioral categorization network 
(excitatory from the good category and inhibitory from the bad category). When there is a 
winner in this network, a motor command signal is sent to the robot arm to initiate a visually 
guided reach movement towards the winning object. 

Figure 8: A gated dipole. It comprises two parallel channels, each 
receiving a common arousal signal “I”. Each of these channels has an inhibitory effect on the other. 
The channel receiving the external signal “J” is called the “ON” channel and the other is called 
the “OFF” channel. The dark squares represent synapses containing depletable transmitters. 
The transmitted signal in each channel depends on the total input received and the amount of 
transmitter present in that channel. Initially, when no external input is applied, the activities in 
the “ON” and “OFF” channels are the same since they receive the same input. On application of 
the external signal “J”, the “ON” channel has a larger activity than the “OFF” channel. Since 
the rate of depletion of the transmitter depends on the input, the longer the signal “J” is 
on, the greater the depletion of the transmitter in the “ON” channel synapse. This causes the 
activity of the “ON” channel to slowly decay and thus gives a measure of the novelty of input 
“J”. 

Figure 9: A neural architecture for novelty detection comprising an array of gated dipoles. 
These gated dipoles are connected to a winner-take-all network. The reinforcement signals also 
influence the choice of the novel stimulus by gating the neurons in the winner-take-all layer via 
additional “reward”, “punish” and “delay” neurons. The figure legend distinguishes excitatory, 
inhibitory, plastic excitatory and plastic inhibitory connections. (Modified from Levine and Prueitt, 1989.) 

The object novelty network comprises two parts, the object-type categorization network 
and the novelty detection network. The object-type categorization network is an ART network 
that categorizes the input objects into different types depending on their features. The output of 
the categorization layer is then fed to a novelty detection network comprising gated dipoles, 
which determine whether the object type is novel. The outputs of these gated dipoles are fed 
to a winner-take-all network. This winner-take-all network also receives inputs from the 
behavioral categorization network. The combined signal from the object novelty and behavioral 
categorization networks is used to drive the robot's arm. 
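
The way novelty and behavioral signals are combined before a reach is triggered can be sketched as a simple selection rule. This is an algorithmic stand-in for the winner-take-all layer, not its neural dynamics: the additive combination, the threshold and the function name are assumptions made for illustration.

```python
# Sketch of a winner-take-all choice over attended objects.
# The additive rule, threshold and names are hypothetical simplifications.

def select_target(novelty, good, bad, threshold=0.5):
    """Return the index of the winning object, or None if no drive exceeds threshold.

    novelty, good and bad are equal-length lists holding, per object, the
    novelty signal, the excitatory "good" signal and the inhibitory "bad" signal.
    """
    drive = [n + g - b for n, g, b in zip(novelty, good, bad)]
    if not drive:
        return None
    winner = max(range(len(drive)), key=drive.__getitem__)
    return winner if drive[winner] > threshold else None


if __name__ == "__main__":
    # Object 0 is very novel but categorized as bad; object 1 is familiar but good.
    print(select_target(novelty=[0.9, 0.3], good=[0.0, 0.5], bad=[0.8, 0.0]))
```

In the example, the inhibitory signal from the bad category outweighs the novelty of the first object, so the second object wins and would be the target of the reach movement.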

The behavioral categorization network in the center of Figure 7 comprises an ART network. 
This ART network is modified to dynamically change its internal criterion for categorization. 
Figure 11 gives a more detailed view of a network having properties similar to those of the behavioral 
categorization network of FRONTAL. In this network, there are three features and four 
categories. To understand how the behavioral network categorizes an object, consider Figure 10. 
The input object is shown at the bottom of the figure. The two objects at the top represent the 
templates for the “good” and the “bad” categories. It can be seen from the figure that the 
categorization of the input is ambiguous if the criterion to be used in the categorization 
is not known. The habit and the reinforcement signals guide the network in its choice of the 
categorization criterion. The reinforcement neuron encodes the reinforcement 
signals issued externally to the robot. This non-specific signal is correctly assigned to the network's current choice 
of internal criterion by the match neuron. Both the reinforcement and match neurons are shown 
in the behavioral categorization network of Figure 7. The habit neurons at the bottom of the 
behavioral categorization network memorize the past experience of the network. The bias neurons 
combine reinforcement and habit signals to generate the appropriate internal criterion to be 
used to categorize input objects. Thus this network dynamically modifies its internal criterion 
for categorization depending on its past experiences and the reinforcement signals it receives. 
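
The role of the internal criterion can be illustrated with a deliberately simplified, non-neural sketch. It is not the ART-based network: the feature values, the rule that bias is the sum of a reinforcement trace and a scaled habit count, and every name below are assumptions introduced for the example. Only the criteria (shape, number, color) and the good/bad categories come from the text.

```python
# Sketch of variable-criterion categorization with habit and reinforcement.
# Feature values, the bias rule and all names are hypothetical simplifications.

CRITERIA = ("shape", "number", "color")


def categorize(obj, templates, criterion):
    """Return the category whose template matches `obj` on `criterion`, else None."""
    for name, template in templates.items():
        if template[criterion] == obj[criterion]:
            return name
    return None


class CriterionSelector:
    """Tracks habit counts and reinforcement per criterion and picks the current one."""

    def __init__(self, habit_gain=0.1):
        self.habit_gain = habit_gain
        self.habit = {c: 0 for c in CRITERIA}
        self.reinforcement = {c: 0.0 for c in CRITERIA}

    def bias(self, criterion):
        return self.reinforcement[criterion] + self.habit_gain * self.habit[criterion]

    def current(self):
        return max(CRITERIA, key=self.bias)

    def step(self, reinforcement_signal):
        """Credit the non-specific signal to the criterion currently in use."""
        used = self.current()
        self.habit[used] += 1
        self.reinforcement[used] += reinforcement_signal
        return used


if __name__ == "__main__":
    templates = {
        "good": {"shape": "circle", "number": 1, "color": "red"},
        "bad": {"shape": "triangle", "number": 2, "color": "blue"},
    }
    obj = {"shape": "triangle", "number": 3, "color": "red"}
    selector = CriterionSelector()
    selector.habit["color"] = 3          # the system has a habit of sorting by color
    print(selector.current(), categorize(obj, templates, selector.current()))
    selector.step(-1.0)                  # punishment while color is in use
    print(selector.current(), categorize(obj, templates, selector.current()))
```

The same input is first categorized as good (by color) and, after a negative reinforcement is credited to the color criterion, as bad (by shape). This mirrors the ambiguity described for Figure 10.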

The ambiguity neuron (shown at the top of the behavioral categorization layer) enables the
network to assign the input object to one of the behavioral categories in ambiguous situations.
The ambiguity neuron accomplishes this by biasing one of the category neurons. The decision
making neurons filter the transients generated by the category layer of neurons (i.e., the F2 layer
of ART) during competition. This suppression of spurious transients and passing of steady state
signals enables this network to be interfaced with other networks in a continuous non-algorithmic
manner. Simulations of the working of this network are presented later.
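The ambiguity described above (the same input falling into different categories depending on the criterion) can be illustrated with a minimal sketch. The templates, feature values and the `categorize` helper below are hypothetical and chosen only for illustration; the report's network resolves the choice of criterion through ART dynamics, reinforcement and habit signals, not through a table lookup.

```python
# Hypothetical templates and feature values, loosely following Figure 10.
TEMPLATES = {
    "good": {"shape": "circle", "color": "red", "number": 1},
    "bad": {"shape": "square", "color": "blue", "number": 2},
}


def categorize(obj, criterion, templates=TEMPLATES):
    """Return the names of the categories whose template agrees with obj on one criterion."""
    return [name for name, template in templates.items()
            if template[criterion] == obj[criterion]]


# An input with the color of the "good" template and the shape of the "bad" one.
ambiguous_input = {"shape": "square", "color": "red", "number": 3}

by_color = categorize(ambiguous_input, "color")    # matches the "good" template
by_shape = categorize(ambiguous_input, "shape")    # matches the "bad" template
by_number = categorize(ambiguous_input, "number")  # matches neither template
```

When no template matches on the chosen criterion, or when several do, some additional mechanism has to pick a category; in the report this is the role assigned to the ambiguity neuron.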




[Figure 10 drawing. Recoverable labels: template for the "good" category, template for the "bad" category, input.]


Figure 10: The input object shown at the bottom of the figure has to be categorized into one of
two categories whose templates are shown at the top of the figure. The template at left may be,
for example, for "good objects" (the system will then pick this object) and the template at right
may be for "bad objects" that the system learned to avoid through reinforcement signals. The
categorization here is ambiguous in that if color is taken as the criterion then the input is a good
object, but if shape is taken as the criterion then the input is a bad object.





Figure 11: A neural network architecture capable of dynamically modifying its internal criterion
(shape, number, or color) for categorization: The reinforcement signal is encoded by the
reinforcement neuron. The habit neurons memorize the number of times a given internal criterion
was used for categorization. The bias neurons combine reinforcement and habit signals
and modulate the internal criterion of the network. The match neurons encode which criterion
is currently being used for classification. This plays an important role in gating the non-specific
reinforcement signal with a particular internal criterion. The decision and the ambiguity
neurons are introduced for self-contained, continuous, non-algorithmic functioning of the network.
Spurious transients that could arise in the F2 layer of ART due to competition are filtered by
the decision neurons. The ambiguity neuron is involved in the selection of one of the possible
categories in situations when an object can be assigned to more than one category. (Modified
from Leven and Levine, 1987).










A similar type of network, used in FRONTAL, enables the robot to decide whether the object
it is looking at is a good object (and hence pick it) or a bad object (and hence not pick it). Good
objects for the robot are those that have been correlated with positive reinforcements and bad
ones are those that have been correlated with negative reinforcements.

The FRONTAL network shown in Figure 7 thus enables the robot to scan for objects in its
environment and to categorize these objects by picking the good and novel ones and by refraining
from the bad ones. FRONTAL also provides the robot with the ability to modify its internal
representation of the environment dynamically by interacting with its environment. In conclusion,
FRONTAL enables the robot to self-organize in a dynamic environment.

2.4 Communication Protocol: Sockets 

With the advent of cost-effective, fast, dedicated processors, task-specific computers are now widely
used. Many applications require the development of firmware to communicate between these
computers. Various standards are available for the development of these communication interfaces.
In this implementation, we initially developed a communication interface using the TCP/IP
sockets protocol (see footnote 7). The communication interface was designed to carry, in the
simplest manner, information between the Amdahl supercomputer and both the Sun Sparc 2 (running
the vision system) and the Unix workstation controlling the arm (Figure 12). The sockets approach
was used instead of the datagram approach so as to ensure reliable communication between the
computers. The overall strategy was to allow each of the computers controlling the peripheral
systems (i.e., the vision and arm) to interact independently with the Amdahl, where the FRONTAL
(brain) system is running. When a new object is introduced in the visual space of the system, this
information is communicated to FRONTAL by the vision system via a dedicated socket. On
receiving this information, FRONTAL sends back a confirmatory signal. In a similar manner,
when FRONTAL decides to initiate a grasp it communicates with the arm system, which in turn
executes the grasp. On completion of the grasp, the arm system issues a "success" signal to
FRONTAL. Two different types of message packets are used by FRONTAL to communicate
with the vision and the arm systems. The message packet communicated by the vision system
comprises a variable number of data segments, depending on the number of objects present in the
environment. The message stream is terminated by an end-of-line terminator as shown in
Figure 13. Each message stream contains data segments comprising the following information
for an object in the environment: the x, y and z co-ordinates of the centroid of the object; its
shape (whether it is a triangle, a square or a circle); and its size (whether it is small, medium

7 A comprehensive discussion of the various communication protocols as well as the TCP/IP protocols is given
in (Stevens, 1990).






Figure 12: The computers involved in controlling the system and the network between them. The
Amdahl communicates with both the Sun Sparc station and the computer controlling the arm.






[Figure 13 drawing. Recoverable labels: object 1, object 2.]

Figure 13: The message stream from the vision system to FRONTAL.

or large) (see footnote 8). The message stream from the FRONTAL system to the arm system, however,
is a fixed-length message. It has three fields which give the respective x, y and z co-ordinates of
the centroid of the object to be grabbed. Here too the message stream is terminated by a new-line
terminator, as shown in Figure 14. Care has been taken to ensure that the basic unit of
a message packet is a string of arbitrary length. This enables any structure to be passed as a
message stream across the system. This initial implementation of the communication protocol
provided a means to test the robotic system and its constituent parts. Later a more sophisticated
communication protocol was implemented using TELRIP. TELRIP is a NASA software package
that provides interprocess communication protocols between processes running on different Unix
platforms. TELRIP enabled us to segregate the processes controlling the robotic system from
those responsible for communication.

8 The size and the shape of the object are represented by bytes taking one of the values 1, 2 and 3. Thus, for a
medium square the last two bytes would be 2 and 2, respectively.


[Figure 14 drawing. Recoverable labels: message stream to arm, x-coordinate, y-coordinate, z-coordinate, end of line character.]

Figure 14: The message stream from FRONTAL to the Arm system.
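The two message streams described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's implementation: the whitespace-separated text encoding, the `SceneObject` class and all function names are hypothetical, and "fixed length" is interpreted here as a fixed number of fields. Only the field order (centroid co-ordinates, then shape, then size), the codes 1 to 3, and the end-of-line terminator are taken from the text.

```python
from dataclasses import dataclass
from typing import List, Tuple

FIELDS_PER_OBJECT = 5  # x, y, z, shape code, size code (assumed field order)


@dataclass
class SceneObject:
    x: float
    y: float
    z: float
    shape: int  # code 1, 2 or 3
    size: int   # code 1, 2 or 3


def encode_vision_message(objects: List[SceneObject]) -> str:
    """Build one variable-length stream: one data segment per object, then a newline."""
    tokens = []
    for obj in objects:
        if obj.shape not in (1, 2, 3) or obj.size not in (1, 2, 3):
            raise ValueError("shape and size codes must be 1, 2 or 3")
        tokens += [repr(obj.x), repr(obj.y), repr(obj.z), str(obj.shape), str(obj.size)]
    return " ".join(tokens) + "\n"


def decode_vision_message(message: str) -> List[SceneObject]:
    """Split a newline-terminated stream back into per-object data segments."""
    if not message.endswith("\n"):
        raise ValueError("message is not terminated by an end-of-line character")
    tokens = message.split()
    if len(tokens) % FIELDS_PER_OBJECT != 0:
        raise ValueError("incomplete data segment")
    objects = []
    for i in range(0, len(tokens), FIELDS_PER_OBJECT):
        x, y, z, shape, size = tokens[i:i + FIELDS_PER_OBJECT]
        objects.append(SceneObject(float(x), float(y), float(z), int(shape), int(size)))
    return objects


def encode_arm_message(x: float, y: float, z: float) -> str:
    """Three fields (centroid of the object to grab) followed by a newline."""
    return f"{x!r} {y!r} {z!r}\n"


def decode_arm_message(message: str) -> Tuple[float, float, float]:
    if not message.endswith("\n"):
        raise ValueError("message is not terminated by an end-of-line character")
    x, y, z = message.split()
    return float(x), float(y), float(z)


scene = [SceneObject(1.5, -2.0, 0.25, 2, 2), SceneObject(4.0, 5.0, 6.0, 3, 1)]
vision_stream = encode_vision_message(scene)
arm_stream = encode_arm_message(1.5, -2.0, 0.25)
```

Keeping the unit of transfer a plain string, as the text notes, means the same send and receive routines can carry either message type; only the encode and decode helpers differ.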

3 Simulations 

In this section we present simulations of various neural architectures which elucidate the
functioning of the various components that constitute FRONTAL. All of these neural architectures
have been simulated using the Amdahl supercomputer. A numerical ODE solver (the
Runge-Kutta-Fehlberg 4-5 method) developed by Oak Ridge Labs is used for solving the ordinary
differential equations representing these neural architectures. The equations for the neural
architectures and the values of the parameters are presented in Appendix A.
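For readers unfamiliar with the solver family named above, the following is a minimal sketch of a single Runge-Kutta-Fehlberg 4(5) step applied to a scalar shunting equation. It is not the Oak Ridge code used in the report: step-size control is omitted (the error estimate is returned but not acted on), and the test equation and its parameter values are hypothetical.

```python
import math


def rkf45_step(f, t, y, h):
    """One Runge-Kutta-Fehlberg step; returns (5th-order estimate, local error estimate)."""
    k1 = h * f(t, y)
    k2 = h * f(t + h / 4, y + k1 / 4)
    k3 = h * f(t + 3 * h / 8, y + 3 * k1 / 32 + 9 * k2 / 32)
    k4 = h * f(t + 12 * h / 13, y + (1932 * k1 - 7200 * k2 + 7296 * k3) / 2197)
    k5 = h * f(t + h, y + 439 * k1 / 216 - 8 * k2 + 3680 * k3 / 513 - 845 * k4 / 4104)
    k6 = h * f(t + h / 2,
               y - 8 * k1 / 27 + 2 * k2 - 3544 * k3 / 2565 + 1859 * k4 / 4104 - 11 * k5 / 40)
    y4 = y + 25 * k1 / 216 + 1408 * k3 / 2565 + 2197 * k4 / 4104 - k5 / 5
    y5 = (y + 16 * k1 / 135 + 6656 * k3 / 12825 + 28561 * k4 / 56430
          - 9 * k5 / 50 + 2 * k6 / 55)
    return y5, abs(y5 - y4)


def integrate(f, t0, y0, t_end, h):
    """Fixed-step integration built on rkf45_step (no adaptive step-size control)."""
    t, y = t0, y0
    while t < t_end - 1e-12:
        step = min(h, t_end - t)
        y, _ = rkf45_step(f, t, y, step)
        t += step
    return y


# Hypothetical shunting neuron: dx/dt = -A*x + (B - x)*I with a constant input I.
A, B, I = 1.0, 1.0, 1.0


def shunting(t, x):
    return -A * x + (B - x) * I


x_numeric = integrate(shunting, 0.0, 0.0, 5.0, 0.05)
x_exact = (B * I / (A + I)) * (1.0 - math.exp(-(A + I) * 5.0))
```

A production solver would shrink or grow the step from the returned error estimate; the fixed step is used here only to keep the sketch short.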

The user interface of the simulation enables the modification of external signals (introduction
or removal of objects and external reinforcement signals) by interrupting the program as and
when needed. On interruption, all the state variables of the network are pushed onto the stack
of the computer and the interrupt is handled. On returning from the interrupt, these state
variables are reloaded and the network equations are solved from the same internal state that
the network had before the interrupt occurred.

3.1 The novelty detection network 

Simulations demonstrating the capability of the neural network to recognize novelty are shown
in Figures 15 and 16. Since the same type of network, namely the gated dipole, is used for
both spatial and object type novelty, the simulations apply to the former when the input signals
come from the spatiotopic locations and to the latter when they come from the category layer
of the categorization network. In this simulation, the network shown in Figure 0 (described by
the equations presented in Appendix A.1) is implemented. The plots in Figure 15 represent
the temporal sequence in which four inputs are presented to the neural network, a high signal
implying the presence of the input and a low signal implying its removal or absence. The plots



Figure 15: This figure along with Figure 16 demonstrates the novelty detection capability of the
network shown in Figure 2. The four panels in this figure represent the sequence in which four
inputs are presented and removed from the network's environment. A high signal implies the
presence and a low signal indicates the absence of the input. The response of the network is shown
in the next figure.


[Figure 16 plots. Horizontal axis: time.]



Figure 16: The panels plot the activity of the four neurons in the "winner-take-all" network (see
Figure 2). The sequence of inputs presented to the gated dipole circuits that project to these
competing neurons is shown in the previous figure.











in Figure 16 graph the temporal activity of the x_3 neurons of the winner-take-all network (see
Figure 0). The horizontal line indicates a threshold value above which the activity of any
neuron implies that the network focuses its attention on the corresponding input. As can be seen
from the plots of Figure 16, the activity of the x_{1,3} neuron (the notation x_{i,3} refers to the x_3
neuron of the winner-take-all network which receives excitatory input from the i-th gated-dipole
network sampling input i) is above threshold until the second object arrives at around 227 time
units, which causes the activity of the x_{2,3} neuron to rise above threshold. This in turn causes
the activity of the x_{1,3} neuron to go below threshold. Similarly the arrival of objects three and
four causes the activity of the x_{3,3} and x_{4,3} neurons to respectively go above threshold. Thus,
other things being equal, novelty guides the attention of the network.
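The fading of novelty in a gated dipole comes from transmitter depletion. The sketch below illustrates this for a single "ON" channel, assuming the transmitter law dz/dt = alpha*(beta - z) - gamma*(I + J)*z, which is the form the garbled equations of Appendix A.1 appear to take. The symbol names, the parameter values and the Euler integration are assumptions made for illustration and are not the values or the solver used in the report.

```python
def simulate_habituation(alpha, beta, gamma, arousal, specific, t_end, dt=0.01):
    """Euler integration of dz/dt = alpha*(beta - z) - gamma*(I + J)*z for a step input J."""
    z = alpha * beta / (alpha + gamma * arousal)   # resting level with arousal only
    gated_signal = []
    for _ in range(int(round(t_end / dt))):
        drive = arousal + specific
        gated_signal.append(drive * z)             # transmitter-gated channel input
        z += dt * (alpha * (beta - z) - gamma * drive * z)
    return z, gated_signal


# Hypothetical parameter values.
ALPHA, BETA, GAMMA, AROUSAL, SPECIFIC = 0.1, 1.0, 0.5, 0.2, 1.0

z_final, gated = simulate_habituation(ALPHA, BETA, GAMMA, AROUSAL, SPECIFIC, t_end=100.0)
z_equilibrium = ALPHA * BETA / (ALPHA + GAMMA * (AROUSAL + SPECIFIC))
```

The gated signal is largest when the input first appears and then settles to a lower level as the transmitter depletes, which is one way to read the statement that an input is attended while novel and released once it has been present for some time.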

3.2 Reinforcement versus novelty 

Reinforcement can bias the attention of the network, as demonstrated by the simulation result
presented in Figure 17. Initially input 1 is presented to the network, to which the network
immediately responds by activating the x_{1,3} neuron above threshold. Following this, input 2
is presented and the network attends to it on account of its novelty. When input 2 is removed,
the network attends back to input 1. Now, while the network attends to input 1, a positive
reinforcement is delivered for about 20 time units. After this reinforcement, when input 2 is
introduced again, the attention of the network remains on input 1 despite the fact that input 2 is
relatively more novel. This is the result of the previously delivered positive reinforcement that
the network associated with input 1.

Negative reinforcement, on the other hand, yields opposite effects, as shown in Figure 18.
When an input is associated with punishment, the network learns to avoid that input. Even
when the input reappears much later, its novelty is not strong enough to bias the network's
attention towards it. Thus the network learns to avoid punishing inputs even though they could
be relatively novel. The effects of both the punishment and the reward fade away with time if
further reinforcements are not issued, and eventually novelty dominates.

3.3 The delay neuron 

In the case of positive reinforcement, the encoding of the STM of the reward neuron into LTM
follows the classical Hebbian learning rule with decay. This is possible because the reward
node is connected via excitatory connections to neurons in the choice layer. As a result, when
reward is delivered this reinforces the activity of the choice neuron which is supra-threshold
(thereby crediting reward to the current choice). This creates a temporal correlation of pre-
and post-synaptic activities as required in a Hebbian learning term. However the "punish" node



[Figure 17 plots. Horizontal axis: time. Recoverable labels: threshold, reward node, punish synapse wts, default wts.]


Figure 17: The effects of positive reinforcement: Initially the network attends in order to inputs 1
and 2 due to their novelty. However the application of a reward signal (positive reinforcement) to
the network while it is attending to input 1 causes the network to ignore object 2, even though it is
relatively novel due to its reintroduction. The network associated the positive reinforcement with
input 1 and this outweighed the novelty of input 2. The encoding of the temporal association
between the activities of the reward and decision neurons into LTM (i.e., into the reward synaptic
weights) is shown in the activity of the reward synapse.


[Figure 18 plots. Horizontal axis: time. Recoverable labels: reward synapse wts, default wts, punish.]


Figure 18: The effect of negative reinforcement: Initially the network attends to input 1. On
application of negative reinforcement, the network shifts its attention away from input 1. The
arrival of input 2 causes the network to attend to this novel input. The reappearance of input
1 after the removal of input 2 is not sufficiently novel to outweigh the effects associated with
its punishment. Hence the network avoids input 1. The lower right panel plots the encoding of
negative reinforcement signals on the punish synaptic weight of the network.













has inhibitory connections with neurons in the winner-take-all circuit in order to depress the
activity of the neuron which is supra-threshold and thereby rapidly force the robot to avoid that
particular object. As a result of this inhibitory effect, pre- and post-synaptic activities remain
simultaneously active for only a very brief time period (see Figure 18). This leads to an ineffective
coding of the punish neuron activity into LTM via a Hebbian term, which requires a temporal
correlation of pre- and post-synaptic activities. To avoid this problem, the delay neuron, whose
STM trace follows a delayed version of the x_{i,3} neuron, is introduced along with a modified learning
rule which is given in Appendix A.
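The argument above can be illustrated with a toy simulation. The sketch assumes a generic Hebbian weight with decay toward a default value of 1, a rectangular punish signal, and a first-order low-pass filter standing in for the delay neuron. None of these forms or constants is taken from the report: the actual modified learning rule is the one in Appendix A, and every name and value below is hypothetical.

```python
def peak_weight(use_delay, dt=0.01, t_end=100.0):
    """Euler simulation of a toy Hebbian weight with decay toward a default of 1."""
    decay, gain, tau = 0.01, 0.1, 10.0       # hypothetical constants
    punish_on, punish_off = 50.0, 70.0       # window of the punish signal
    suppressed_at = punish_on + 0.5          # inhibition silences the chosen neuron
    delayed, weight, peak = 0.0, 1.0, 1.0
    for step in range(int(round(t_end / dt))):
        t = step * dt
        x = 1.0 if t < suppressed_at else 0.0             # winner-take-all activity
        p = 1.0 if punish_on <= t < punish_off else 0.0   # punish neuron activity
        post = delayed if use_delay else x
        weight += dt * (-decay * (weight - 1.0) + gain * p * post)
        delayed += dt * (x - delayed) / tau               # low-pass (delayed) copy of x
        peak = max(peak, weight)
    return peak


peak_direct = peak_weight(use_delay=False)   # correlation with x itself
peak_delayed = peak_weight(use_delay=True)   # correlation with the delayed trace
```

With the direct correlation the weight barely leaves its default, because the punished neuron is silenced almost as soon as the punish signal arrives; correlating with the delayed trace keeps the product non-zero for much longer, so the weight change is substantially larger.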


3.4 Variable criterion categorization 


In complex environments, it is often necessary to modify the criteria used in classification according
to prevailing conditions. For example, while color may be an adequate criterion to separate good
and bad apples during a certain period of the year, during other times color may be misleading
while the size of the apples may be more adequate (cf. the example in Figure 10). Figure 11 describes
a network capable of changing its categorization criterion based on reinforcement signals. In order
to avoid noise in reinforcement signals, the network forms "habits" that encode the frequency of
behaviors. The criterion of the network is modulated by combining evidence from reinforcement
and habit signals. Simulations demonstrating this property are shown in Figures 20, 21 and 22.

The upper three panels of Figure 20 describe graphically the input presented to the network at
different time instants. Each input possesses three features. Each feature has four distinct values
(types). For example, a feature can be color and the four types can be red, blue, yellow, and
green. Thus a total of 4x4x4 = 64 distinct inputs can be presented to the network. Each of the
three panels describes one of the features of the input. The four distinct types of each feature are
represented by the different stylings of the "bars". A set of bars, one from each panel, starting
at the same time represents a particular input presented to the network. The width of the bars
represents the time taken by the network to categorize this input. The category layer of the ART
network used in the simulation has four neurons. Hence, the inputs are categorized into one of
four possible categories. Figure 21 shows the category chosen by the network for a given input
at different times. Each panel represents the activity of a neuron in the category layer of the
ART. The supra-threshold activity in a given panel indicates that the input is categorized to that
particular category. Category neuron activities have similar styling to the four possible types of
each feature. At any instant of time, at most one category has a "bar" indicating that the network
classified the input object to that category. The feature used by the network to categorize the
input can be easily identified by comparing which of the first three panels of Figure 20 has a
similar bar as the category panel at the given instant. For example, the first input presented to
the network is of type 2 of feature 1, type 1 of feature 2, and type 3 of feature 3. The network


[Figure 19 plots. Horizontal axis: time. Recoverable labels: punish node, punish synapse wts, default wts.]


Figure 19: The three panels shown in this figure demonstrate the activities of the delay and punish
neurons and the punish synaptic weights after the application of a negative reinforcement signal to
the network as discussed in Figure 0. The delay neuron follows the activity of the x_{i,3} neuron in a
delayed fashion so that the STM of the punish neuron can be encoded into LTM by the punish
synaptic weights.


[Figure 20 plots. Horizontal axis: time.]



Figure 20: The first three panels describe the features of inputs presented to the network at
different time instants. Each input possesses three features (e.g., shape, color and size) and each
feature can take one of four possible values (types). Hence 64 different inputs can be presented
to the network. Each of the first three panels represents a feature. The four different styles of
bars in each panel represent the four different types of a given feature (e.g., for color they may
correspond to white, blue, yellow, and red). At any instant of time, the bars represented by the
three panels describe the properties of the input presented to the network, e.g., the first input
is of type 2 of feature 1, type 1 of feature 2 and type 2 of feature 3. The width of the bars
represents the time the network took to categorize that object. The categorizations performed
by the network are presented in Figure 21. The last panel describes the reinforcement signals
delivered to the network in response to its categorization of the object.



Figure 21: The activities [...] this figure. Each of the [...] time. At any given instant [...]
of bars represent the four [...] comparing the styling of [...] criterion used by the [...]


[Figure 22 plots. Horizontal axis: time. Recoverable labels: habit 1, habit 2, threshold.]



Figure 22: The top three panels plot the activities of the bias neurons. Initially the network
categorizes using feature 3 as the criterion, which is indicated by the activity of bias neuron 3
being above threshold. As a consequence of receiving negative reinforcement signals at a later
time, the internal criterion of the network changes. This is illustrated by a drop in the bias
neuron 3 activity and an increase in the activity of bias neuron 2. The internal criterion of
the network is further changed to feature 1 by issuing negative reinforcement signals at a later
instant. The bottom three panels plot the activities of the habit neurons. As the number of
times a particular criterion is used by the network to categorize the input object increases, the
activity of the appropriate habit neuron increases. As can be seen from the three plots, initially the
activity of habit neuron 3 increases, followed by habit neuron 2 and finally habit neuron 1.










3.5 Spatial novelty and attentive scanning in FRONTAL 

The self-organizing autonomous robotic system discussed in section 2.3 consists of four
subsystems: the behavioral categorization system, the object novelty system, the spatial novelty
system and the attentive scanning system. The behavioral categorization system and the object
novelty network function in a similar manner to the networks discussed in the previous
subsections; hence simulations pertaining to those are not presented. In this section simulations
dealing with attentive scanning and the spatial novelty system are presented. These two systems
together enable the robot to "explore" for novel objects, as well as to scan attentively the objects
present in the environment. The lower panel in Figure 23 shows the sequence of presentation of
inputs to the network. Three inputs, placed at three different spatial locations, are presented
successively to the network. The upper three panels show the activities of neurons representing
these three spatial locations in the layer where the supra-threshold activity of a neuron indicates
the spatial locus of attention of the robot. Following the introduction of the first input, the robot
starts to scan this input. When the second and third inputs are introduced, the robot's attention
sequentially scans all three inputs. As one can see from the simulation results, after some time
the novelty of the inputs vanishes and the robot stops scanning the inputs.

4 User-Interface 

Two different user interfaces are provided for interacting with the robot. The first of these two
user interfaces is a menu driven interface that can be invoked from any standard terminal. The
interface is invoked when the FRONTAL simulation receives a user generated interrupt signal
(Cntrl-C is the default interrupt signal). The menu provides means to change reinforcement
signals as well as to monitor various variables of the simulation. The second user interface is an
X-based interface providing a graphics based environment. The X-based user interface consists of
three windows: two for displaying the states of the system and one which enables interaction with
the system. One of the two output windows displays the visual input to FRONTAL as well as the
object that FRONTAL is currently attending to for categorization. This window also displays
how FRONTAL categorized the object. The second output window displays the processing
stages of FRONTAL as it scans, selects, and categorizes the object in its environment. The
input window is similar to that discussed for the non X-window based user interface. It too provides
a menu driven interface that can be invoked by a user generated interrupt.
















5 Limitations 


The robotic system discussed above had some shortcomings due to the vision and robotic arm
systems. The vision system was very sensitive to light intensity and the position of the light
source. Shadows cast by the object under different directions of incident light cause BLOBS to
make errors in detecting the object type. Moreover the light intensity also affects the performance of
BLOBS.

A major limitation of the robotic arm system was the necessity to recalibrate the arm every
time the robotic system was started. This is due to the drift in the potentiometers that calibrate
the motors of the arm. Hence for a given spatial target location, a different set of arm joint angles
is required for the robot arm to reach the target every time the arm system is shut down. This
leads to an inconsistent visuo-motor map in the robotic system. Another limitation of the
robotic arm is a drift observed in the z-direction as the arm moves along the y-direction. This
z-coupling associated with the movement in the y-direction of the arm is shown in Figure 24. As
can be seen from the three experimental data sets shown in the graph, there is a drift of about
0.5 inches in the z-direction as the robotic arm moves linearly in the y-direction. Attempts to
model this non-linear z-coupling did not yield satisfactory results. As the centroids of the objects
were more than 3.0 inches apart, this z-coupling did not cause problems in realizing which object
the robot was intending to grab. However the actual grabbing of the object was not always
successful.

6 Conclusion and future work 

In this report we presented the details of the hardware implementation of a robotic system driven by
an adaptive neural network. The main weakness of the robot resides in the traditional algorithmic
vision and arm control modules. Our future work consists of replacing these modules by adaptive
neural network modules.


The visuo-motor map refers to the mapping of a spatial location identified by the vision system to the joint angles
required for the robot arm to reach that location.


[Figure 24 plots.]

Figure 24: The three figures show the vertical drift of the robot arm as it moves in the horizontal
direction. The three graphs represent three different experiments.





A Appendix A 


The equations and the parameters used for the various networks are given in this appendix under
the various subsections.


A.1 Reinforcement-novelty detection network

The amounts of transmitter in the "ON" and "OFF" channels of the i-th gated dipole are described
by z_{i,1} and z_{i,2} respectively (see Figure 0), whose dynamics are given by

    \frac{dz_{i,1}}{dt} = \alpha (\beta - z_{i,1}) - \gamma (I + J_i) z_{i,1}                    (1)

    \frac{dz_{i,2}}{dt} = \alpha (\beta - z_{i,2}) - \gamma I z_{i,2}                            (2)

where \alpha is the transmitter replenishment rate, \beta is the maximum amount of transmitter, \gamma is
the rate of transmitter depletion, and I and J_i are arousal and specific inputs respectively. x_{i,1} and
x_{i,2} are respectively the "ON" and "OFF" channel neurons of the i-th gated dipole. They follow
the shunting equations

    \frac{dx_{i,1}}{dt} = -A x_{i,1} + (B - x_{i,1}) (I + J_i) z_{i,1} - x_{i,1} I z_{i,2}         (3)

    \frac{dx_{i,2}}{dt} = -A x_{i,2} + (B - x_{i,2}) I z_{i,2} - x_{i,2} (I + J_i) z_{i,1}         (4)

where A is a passive decay rate and B is the upper saturation level. The "ON" and "OFF" channel
outputs are combined by the neurons in the winner-take-all layer (see Figure 0):


dt 


+ (B — } + G| tt'j r r + — #)) 

~ • r i.:d’ r /.‘2 + GjfC; m pl> + H ^ ./ ( ; ~~ ^)). 


(C>) 


with

    f(x) = x u(x)

where G_1, G_2, G_3, H and \theta are positive constants and u(x) is the unit step function. The activities
of the reward and punish neurons are given by


(h_ 

df 

'III 

(It 


- 1 1 /■ + (£?, - nil + r, ■ 

- I T p + [B\ — /'I P + ( . 


(S) 
I !)) 




where R and P are reward and punishment inputs respectively and C_1 is a positive constant.
The reward weight and punish weight follow


— — = - hi'i'i.r ~ 1 ) + (-V - ) B->u(r - tt\ 0). (10) 

dll'. 

—j— = -A- 2 (tr , - 1 ) + ( 3 )//(//,■ - H>) 111 ) 

wit h 

( 12 ) 

where A_2, M_r, and \theta_2 are positive constants. y_i is the activity of the delay neuron, which

follows the shunting dynamics

—jj’ — — cPp/, + (/?;[ — — #). (lo) 

where A_3, B_3 and the associated threshold are positive constants.

A.2 Reinforcement based classification

The dynamics of the feature neurons of the modified ART network (see Figure 11) follow the
shunting equation and are given by

,1 c 1 1 

~j 1 = -A.Vi + (B -.v i )(I i + ^fi< h )-. j ,)-r l (Y,f{'l J )l). / = 1.2 12. (II) 

iU j= > ./—I 

wit li 


where A is the decay constant, B is the upper saturation level, I_i is the input applied to the
network and \Gamma is the reset signal. z_{j,i} is the top-down weight from the category neuron y_j to the
feature neuron x_i, as shown in Figure 11. The activity of y_j is as follows


<hij 

<lt 


I 

— - 1 Uj + [D — Hj ) ( ./(//, + [n — ^.>] + e, -I- y //( diiiij.r, ) ) 

i 1 

— t/ j ( ./ ( //»• ) + 2” ) . = l . 2 . -1 . I . 


( 1 ( 1 ) 




with

    [x]^+ = x u(x)                                                                          (17)

where A and B are positive constants, z_{i,j} is the bottom-up weight of the ART network, \Omega_k is a
bias neuron and n is the ambiguity neuron with a random weight e_j to the category neuron y_j.
The bias neuron activity, which integrates both habits and the external reinforcement signal, is given
by

E(fi* -0 3 ) + {(/•'- 12* )([/»*■ +n[H} + +uiih)) 

n t .(o[/?] _ + (*Y. //(12,.))}/(«h ) A- = 1.2.1. I bS) 

r-Jtk 




with 


[.!•]- = (10) 

where E, F, G, \alpha, H and \theta_3 are positive constants and R is the external reinforcement signal. The
dynamics of h_k, the habit neuron, and \phi_k, the match signal, are as follows:

— - = Hh k .{(J -ln )['h- -E,} + -[<!><■ -*,]*} A- = 1.2.1. (20) 

(If 

dip, Ji, ' 

-T- = . + (13- <h){ X ]T 

— I t — I 

-T<.I A- = 1.2.3. (21) 

where H, J and the associated threshold are positive constants. \Gamma is the reset signal. The neurons
in the decision layer following the categorization layer (see Figure 11) are known as p_j and their
activities are given by

-pM'Y."' ' = 1-2.1. L 

,lf 

where A_1, B_1 and W are positive constants. Finally the dynamics of the ambiguity neuron, which
plays a role in biasing one of the category neurons y_j under ambiguous situations, are as follows

, i J i 

— = —An -(- ( B — n 1 } y I \ — n 1 T Y~ >1 1 1 1 >. — ft, I + I) (23) 


III 

om&mi f \ * h 
Of POOrt QVr.GYf 



wit I) 


'/i (■'*) 


ms - n.Oo) 


where A, B, and T are positive constants.


A.3 The combined FRONTAL network


As FRONTAL is a combination of the two networks described so far, the differential equations
for various parts of FRONTAL are similar to those presented above. These similarities will be
referred to, so as to reduce repetition. The spatial novelty network (see Figure 7) consists of
an array of gated dipoles similar to the novelty network described before. The differential equations
for this network are as given below



    \frac{d\,vx_{i,1}}{dt} = -A\, vx_{i,1} + (B - vx_{i,1}) (I + J_i)\, vz_{i,1} - vx_{i,1}\, I\, vz_{i,2}      (25)

    \frac{d\,vx_{i,2}}{dt} = -A\, vx_{i,2} + (B - vx_{i,2})\, I\, vz_{i,2} - vx_{i,2} (I + J_i)\, vz_{i,1}      (26)

    \frac{d\,vz_{i,1}}{dt} = \alpha (\beta - vz_{i,1}) - \gamma (I + J_i)\, vz_{i,1}                       (27)

    \frac{d\,vz_{i,2}}{dt} = \alpha (\beta - vz_{i,2}) - \gamma I\, vz_{i,2}                               (28)

where the prefix v in the variable names implies that these gated dipole neurons are related to the
visual novelty network. Since the introduction and the removal of objects from the robot's environment
constitute a novel event, both the "ON" and "OFF" channels are presented as excitatory inputs
to the winner-take-all layer. Furthermore the winner-take-all layer receives inhibitory input from
the attentive scanning system (l_i) to enable attentive scanning. The dynamics of the winner-take-all
layer neurons are as given below


t If 


— Ar.rjj + (B — r.r,-.:i)(C i e'.r, j + ( 2 r r i :2 + bn/ f e'* r >.:e “0)1 

— + B ^ — ft) !■ 


( 20 ) 




where A, B, M, G, H and θ are positive constants. The decision layer neurons p_i filter
transients in the winner-take-all layer, and the activity of a neuron in this layer is as follows


<h } K\ 

(it 


— A p; \ + ( B — p I d I V rj ■; ;; — /y|ir V r.r ' :{ /. / = 1 . 2. ..17 

V— ' 


Ol I 





where A, B and W are positive constants.

The attentive scanning layer consists of the arousal neuron I, the inhibiting layer neurons s_i
and the delay layer neurons l_i. The dynamics of these neurons are as follows

= + (B - f)AmiiS(il - fHYi l J\l>,;i j = 1 . 2 . (.‘ 12 ) 

= -A. Hi + {B - Hi)G\'j(Pi.\ - 0| ) - - #■>) /■ = 1.2. ,.l. r ). Cl.‘l) 

= — - 1//, + {B - h)Giu(s, - H,) i = 1.2. ..IS. (31 ) 

where M, A, B, G_1, G_2, θ_1, θ_2 and Arousal are positive constants.

The behavioral categorization network, which categorizes the input objects, is similar to the
modified ART model discussed earlier. The equations for this network are as follows


£ 

<lt 

(Is, 

~ 

dl, 

(It 


db.v, 

~7T 


<£0 

dt 


(lb . j 

~ 


<Jih 

<11 


<lt 

<£n 

dt 


dp, : > 

dt 

(Id 

~dt 


- Abr ; + (B - h.r , ) I f, + V /(/;//■ )b: . , I 

i 

i 

+ D- > = 1-2....G. 

7=1 

l_> 

- Abu j + (B - bijj)[ f{bi/j) + di[l Li _ 1 ^b.i l )bir l + [n - tf|] + fre, ) 

(=i 

-ty, (£/(/>//,•) +1). J = 1-2.3. 

— Alr.j + ( B - b:j){f[b: j ) + <i\hi! l )Utt IJ ) 

./=> 

f(b-r) +1)- j — 1-2. 

ei j 

-Eii'h-VA + {(/'- < h )([/,*. To /?+ + </(!>,)) 

-S1a(o/? - + G ^ ey ( 0 , ) ) [/($*■) A- = 1.2. 

rsfc/r 

///u {(./ - />*)[$<■ - 0-_,] + - [$*. - "■>]- \ k = 1 . 2. 

:U' :<5 

+ (B - T t .){ X Y. hit dp. ~ 

i-M—l i - 1 

-$ t Z A- = 1.2. 

— Ap,-> + [B — p t:1 )!>:, — />;.•_>( y* /•: + X) / = 1. 2 

J — ' 

— A(l + { B — d) — d(T £ />. : J ! 

r 

./ = !. 2. A- = 1.2 0. 


( 33 ) 


( 3 ( 1 ) 


38 ) 


(•• 13 ) 


( 10 ) 





where A, B, C, E, F, G, T, R, θ_1 and θ_2 are positive constants. bz_j is the further classification
of the categorized inputs into "good" and "bad" objects. The decision layer neurons p_{i,2} inhibit
the ambiguity neuron if a behavioral decision is achieved.

The object novelty network comprises an ART network coupled to a novelty detection
network via a layer of slowly integrating neurons q_i. The dynamics of this network are given
below. The equations for the ART network are as follows


d/'c, 

dt 


(It 


0prt 

dt 

<I<1, 

dt 


cr 

-Afr; + (B - T ./I f ll.i ) / -,;.i ) 

j=l 

rj 

J(.f !Jj) + %)• ' — 1 -• •■■6- 

i=i 

-Afiij + (B - ./'//, ) ( /(///, ) + X^'/( /•',)/ 

-///,(£ /(//A) + I). j = 1 ■ - 1-- 

+ ( B - )l>!f j - P,,(T. I>!/ J + 2d i = \. 2 12. 

-\- [B — (ji)pi.:t i = 1 . 2 12 . 


I H) 


( Kl) 


where A, B and the other coefficients are positive constants. The differential equations for the
novelty network are as follows


de.r,. i 
dt 

dry, > 

Tt 

del,.:! 

dt 


de:,.i 

dt 

dr\ 

Tt 

d/y.i 

dt 


= — Ae.r, j + [B - r.i'j \ )(/ + 7 ,)e;,- , - e.r, , /c; ( J . 

= — .--lc.rj .2 + (5 — r.r ; ■>) I <"-i :2 — e.r, _>( / + (Jj)c'.i. l • 

= — -le r, .! + (£? — er,.:! )(e r, j + G:\.h‘ r i.H — 0) + G.([>\ -j ) 

— e.r/.nie.r, > 4- H ^ — 0) + G, y/n.-j)- 

j*< 

= o ( I — e;, , ) — ')(/ + •/; i ■ 

= o(d — e:,-,) - -j /(•:,.>• 

= -Apij + (B — pu )U Vr, ;{ - puU'Yi 1 ''.i .* '-J = 1 • .A. 


( h 


NS) 


( 10 ) 


(btl) 


( r >1) 


where A, B, G_1, G_3, H, θ and W are constants.
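
The behavior of the gated dipoles used by the novelty networks can be explored numerically. The following is a minimal Python sketch, not the authors' implementation, that integrates a single dipole with forward Euler. It assumes the shunting form in which each channel is excited by its own gated signal and inhibited by the opposite one, with transmitter gates that recover toward beta at rate alpha and deplete at rate gamma. The names alpha, beta and gamma, the function name and all numerical values are illustrative assumptions and are not taken from the parameter table.

```python
def simulate_gated_dipole(inputs, arousal=1.0, A=1.0, B=1.0,
                          alpha=0.05, beta=1.0, gamma=0.1, dt=0.01):
    """Integrate one gated dipole with forward Euler.

    inputs  : sequence of specific input values J, one per time step
    returns : list of (x_on, x_off) activities, one pair per time step
    All default parameter values are hypothetical.
    """
    I = arousal
    # Start both transmitter gates at their resting level under arousal alone.
    z_rest = alpha * beta / (alpha + gamma * I)
    z_on = z_off = z_rest
    x_on = x_off = 0.0
    trace = []
    for J in inputs:
        # Gated signals: the ON channel also receives the specific input J.
        s_on = (I + J) * z_on
        s_off = I * z_off
        # Shunting activities with mutual inhibition between the channels.
        dx_on = -A * x_on + (B - x_on) * s_on - x_on * s_off
        dx_off = -A * x_off + (B - x_off) * s_off - x_off * s_on
        # Slow transmitter gates: recovery minus activity-dependent depletion.
        dz_on = alpha * (beta - z_on) - gamma * (I + J) * z_on
        dz_off = alpha * (beta - z_off) - gamma * I * z_off
        x_on += dt * dx_on
        x_off += dt * dx_off
        z_on += dt * dz_on
        z_off += dt * dz_off
        trace.append((x_on, x_off))
    return trace


if __name__ == "__main__":
    # Baseline, then a sustained input, then removal of the input.
    J = [0.0] * 500 + [2.0] * 2000 + [0.0] * 100
    trace = simulate_gated_dipole(J)
    print("end of input   (on, off):", trace[2499])
    print("after removal  (on, off):", trace[-1])
```

While the input is present the ON channel dominates; shortly after the input is removed the depleted ON gate lets the OFF channel dominate. This rebound is the property that lets both the appearance and the removal of an object register as novel events.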

The values for the various parameters used in the above differential equations are given in
the following table.





B Appendix B 

The MaxVideo system consists of the following five modules:

• Analog Scanner (AS)

• Architectural Adapter (AA) 

• Analog Generator (AG) 

• Advanced Pipeline Processor (AP) 

• Arithmetic Unit (AU) 

The Analog Scanner Module (AS module) is the video input device for the MaxVideo system.
It comprises three sections: (i) analog section, (ii) digital section and (iii) timing section. The
analog section enables the imaging system to select from 4 possible input sources. It is capable
of DC or AC coupling the input signal and low pass filtering it to avoid aliasing. This section can
also adjust the gain and offset of the signal as well as perform DC restoration. The digital section
digitizes the preprocessed analog input signal with 8-bit resolution at rates up to 20 MHz. The
digitized images are output through 3 8-bit ports to the AA module. The timing section of this
module is responsible for synchronizing the working of the other two sections. The synchronizing
clock signal for this section can come from one of three possible sources: an external clock signal
generated by a camera or a sensor, the horizontal or composite sync from the camera, or any
arbitrary clock.

The Architectural Adapter Module (AA module) is the motherboard of the MaxVideo system.
It is the only board which connects to the VME bus directly. It is responsible for routing the
raw digitized image via the various modules for processing and displaying. The AA module is
thus capable of both data path control and intermediate storage of the image between
processing steps. The 6 memory modules that are part of the AA module act as source and sink
locations for images being processed. The crosspoint switch, whose 32 input connections can be
connected to 32 output connections, enables the appropriate routing of the image stored in the
memory modules via the MaxVideo modules and back to the memory modules. The appropriate
connection can be programmed using the ImageFlow software. Images stored in various memory
locations can be transparently accessed over the VME bus during the acquisition or display of the image.

The Analog Generator module (AG module) of the MaxVideo video system is responsible
for converting the processed digital data to a variety of video formats. This module accepts digital
data in one of five image display modes depending on the output data precision (the output data
precisions supported by the MaxVideo system have been stated earlier). The five image display
modes that can be selected are as follows.




• 8-bit Greyscale, Image Memory Modules 0, 1, or 2.

• 8-bit Pseudocolor, Image Memory Modules 0, 1, or 2.

• 24-bit RGB (8:8:8), Image Memory Modules 0, 1 and 2.

• 8-bit True Color (3:3:2), Image Memory Modules 0, 1, or 2 and

• 15-bit True Color (5:5:5), Image Memory Modules 0 and 1.

The bracketed quantities represent the manner in which the data stored in the Image Memory
modules are mapped to represent the color values. As each Image Memory module can store only
8 bit planes, 16 and 24 bit plane images require the use of more than one Image Memory module.
The Display Timing Generator generates the appropriate sync and blanking signals for the variety
of display video output.
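
The bracketed bit allocations can be illustrated with a short Python sketch. It is only an illustration of how 8-bit red, green and blue components could be reduced to the 3:3:2 and 5:5:5 layouts, and of why formats deeper than 8 bits span several Image Memory modules. The function names and the bit ordering (red in the most significant bits) are assumptions; the report does not specify them.

```python
def pack_rgb332(r, g, b):
    """Pack 8-bit R, G, B into one byte: 3 bits red, 3 bits green, 2 bits blue.
    Bit ordering (red highest) is an assumption made for illustration."""
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def pack_rgb555(r, g, b):
    """Pack 8-bit R, G, B into 15 bits: 5 bits per component (red highest)."""
    return ((r >> 3) << 10) | ((g >> 3) << 5) | (b >> 3)


def memory_modules_needed(bits_per_pixel, planes_per_module=8):
    """Number of 8-bit-plane memory modules needed to hold one pixel."""
    return (bits_per_pixel + planes_per_module - 1) // planes_per_module


if __name__ == "__main__":
    print(hex(pack_rgb332(255, 0, 0)))      # pure red in 3:3:2
    print(hex(pack_rgb555(255, 255, 255)))  # white in 5:5:5
    for depth in (8, 15, 24):
        print(depth, "bits ->", memory_modules_needed(depth), "module(s)")
```

With 8 bit planes per module, the 15-bit mode needs two modules and the 24-bit mode needs three, which matches the module assignments in the list above.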

The Advanced Pipeline Processor module (AP module) consists of three processing devices
which enable the module to perform a variety of operations on images. The first of these devices
is a statistical processor which is capable of providing 24-bit histogram results on 8-bit plane
image data. This device is also capable of detecting up to 512 features in a 512 x 512 pixel
image and of performing a modified Hough transform on an image to find locations having features
with a given angle. Four (8 x 8) bank look-up tables are provided which need to be used in
conjunction with the latter two tasks to store the features and the angles to be detected in the
image. The four banks can also be used as generic look-up tables. The second device, called the
NMAC, can be used in two modes. In the first mode, it performs a neighborhood 8 x 8 multiply
and accumulate which can be used for performing convolution of the image with a preset mask. In
the second mode, the NMAC can be used as 2 separate 8 x 4 NMACs. This split mode, in
conjunction with a LUT, can be used to perform Sobel edge detection in near real-time.
The third and final device is a 16 x 16 LUT that can perform morphological operations on a 3
x 3 binary neighborhood. This device is capable of producing a 16 x 16 bit output that consists
of a 3 x 3 neighborhood of all the pixels around the current pixel in the input binary image.
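
The neighborhood multiply-and-accumulate operation and its use for Sobel edge detection can be sketched in software. The Python fragment below is an illustrative sketch only: it applies two 3 x 3 masks (one per simulated accumulator, loosely mirroring the split mode) and combines the results with a clipped sum of absolute values, which stands in for the look-up table stage. The combining rule, the function names and the border handling are assumptions, not a description of the MaxVideo hardware.

```python
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def neighborhood_mac(image, mask, row, col):
    """Multiply the mask with the neighborhood centred at (row, col) and
    accumulate the products (correlation form, mask is not flipped)."""
    half = len(mask) // 2
    acc = 0
    for dr in range(-half, half + 1):
        for dc in range(-half, half + 1):
            acc += mask[dr + half][dc + half] * image[row + dr][col + dc]
    return acc


def sobel_magnitude(image):
    """Approximate gradient magnitude |gx| + |gy|, clipped to 8 bits.
    Border pixels are left at 0 because their neighborhood is incomplete."""
    rows, cols = len(image), len(image[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = neighborhood_mac(image, SOBEL_X, r, c)
            gy = neighborhood_mac(image, SOBEL_Y, r, c)
            out[r][c] = min(255, abs(gx) + abs(gy))
    return out


if __name__ == "__main__":
    step_edge = [[0, 0, 255, 255] for _ in range(4)]
    for line in sobel_magnitude(step_edge):
        print(line)
```

On a vertical step edge the interior pixels next to the step saturate at 255, while a uniform image produces 0 everywhere.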

The fifth and final module of the MaxVideo system is the Arithmetic Unit (AU device).
This AU device has six sections: (i) Input section, (ii) Binary Crosspoint section, (iii) Grey
Scale Crosspoint section, (iv) Output section, (v) Linear Processor section and (vi) Non-Linear
Processor section. The Input section takes 8/16 bit two's complement data and converts it to
10/20 bit two's complement data. This 10/20 bit data can then be routed to the Linear and
Non-Linear Processors via the Binary and Grey Scale Crosspoint sections depending on whether the
image is binary or grey scale. 20 bit images are handled by routing 2 (10-bit) paths. The Linear
Processor can perform a variety of linear operations which include addition and multiplication
of the 10/20 bit image data streams. The Non-Linear Processor consists of 6 10-bit
ALUs or 3 20-bit ALUs and receives 4 10-bit inputs from the Grey Scale Crosspoint section. The Binary
Crosspoint section connects to the Non-Linear Processor, enabling binary images to control the
selection of the ALU operation on the 10/20-bit images. This important property of the Non-Linear
Processor enables certain binary properties of the image to regulate the processing that needs to
be done on it.
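
The widening performed by the Input section can be illustrated with a small Python sketch. It assumes that the conversion preserves the numerical value, that is, that it is ordinary two's complement sign extension; the report does not state this explicitly, and the function name is hypothetical.

```python
def sign_extend(value, from_bits, to_bits):
    """Reinterpret `value` as a from_bits-wide two's complement number and
    return its to_bits-wide two's complement bit pattern."""
    value &= (1 << from_bits) - 1            # keep only the source bits
    if value & (1 << (from_bits - 1)):       # sign bit set: negative number
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)      # wrap into the wider word


if __name__ == "__main__":
    for sample in (0x7F, 0x80, 0xFF):
        print(hex(sample), "->", hex(sign_extend(sample, 8, 10)))
    print(hex(sign_extend(0x8000, 16, 20)))
```

The two extra bits give headroom, so that sums of a few 8-bit values can be held without overflow in the later processing stages.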





Figure 25: The schematic of the control of the gripper. On receiving a command to close, the
motor starts to close the digits of the gripper. As the gripper grasps the object, the current sensor
circuit ensures that an adequate grasp of the object is achieved without damaging the object, by
limiting the amount of current allowed. When the motor is commanded to release the object, the
motor starts moving the digits of the gripper until the contact switch feeds back a stop signal.

C Appendix C 

A schematic of the motor control mechanism for the two digit gripper is shown in Figure 25. The
PC microprocessor, in conjunction with the current switch, controls the opening and closing of the
gripper by applying the appropriate voltage polarity to the gripper motor. The current sensing circuit
provides feedback to the current switch on the amount of current received by the motor while
it is grasping an object. The contact switch, on the other hand, senses when the digits of the gripper
have reached their maximal open position and informs the current switch. The microprocessor
communicates with a request issuing computer via a serial line. On receiving a request to
close or to open the gripper, the microprocessor sends the appropriate voltage to the gripper
motor via its parallel port. When a grip of an object is requested, the digits of the gripper close
on to the object. As the force exerted by the digits on the object being gripped reaches a preset
value, the current sensing circuit prevents further force being applied by the digits by restricting
the current to the preset value. Similarly, during the opening of the digits, the contact switch
turns off the current applied to the motor once the maximal opening is reached.
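
The control logic described above can be summarized in a short simulation. The Python sketch below is a hedged illustration, not the controller used on the robot: the class SimulatedGripper, the current model (in milliamps), the position units and the request strings "close" and "open" are all hypothetical. The serial line and the parallel port are not modeled.

```python
class SimulatedGripper:
    """Toy stand-in for the gripper motor, current sensor and contact switch."""

    def __init__(self, object_at=30, fully_closed=100):
        self.position = 0              # 0 means the digits are fully open
        self.object_at = object_at     # position at which the digits touch the object
        self.fully_closed = fully_closed

    def motor_current_ma(self):
        # Current rises once the digits start squeezing the object (assumed model).
        squeeze = max(0, self.position - self.object_at)
        return 100 + 200 * squeeze

    def contact_switch_closed(self):
        # The contact switch trips at the maximal open position.
        return self.position <= 0

    def step(self, direction):
        # direction +1 closes the digits, -1 opens them.
        new_position = self.position + direction
        self.position = max(0, min(self.fully_closed, new_position))


def handle_request(gripper, request, current_limit_ma=500):
    """Serve one 'close' or 'open' request and return the final position."""
    if request == "close":
        # Keep closing until the sensed current reaches the preset limit.
        while (gripper.motor_current_ma() < current_limit_ma
               and gripper.position < gripper.fully_closed):
            gripper.step(+1)
    elif request == "open":
        # Keep opening until the contact switch reports maximal opening.
        while not gripper.contact_switch_closed():
            gripper.step(-1)
    else:
        raise ValueError("unknown request: " + repr(request))
    return gripper.position


if __name__ == "__main__":
    gripper = SimulatedGripper()
    print("closed at position", handle_request(gripper, "close"))
    print("opened at position", handle_request(gripper, "open"))
```

The closing loop stops on the current threshold rather than on a position, which is what allows objects of different sizes to be grasped with the same preset force.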











References 


[1] Craig J. J., Introduction to robotics: mechanics and control. Addison-Wesley: Mass., 1989.

[2] Leven S. J., and Levine D. S., Effects of reinforcement on knowledge retrieval and
evaluation. Proceedings of the First International Conference on Neural Networks, San Diego:
IEEE/ICNN, vol. II, pp. 269-279, 1987.

[3] Levine D. S., and Prueitt P. S., Modeling some effects of frontal lobe damage: novelty and
perseveration. Neural Networks, vol. 2, pp. 103-116, 1989.

[4] Piaget J., The origins of intelligence in children. Norton: New York, 1963.

[5] Piaget J., La psychologie de l'intelligence. Armand Colin: Paris, 1967.

[6] Piaget J., The mechanisms of perception. Basic Books: New York, 1969.

[7] Piaget J., Psychologie et épistémologie. Denoël Gonthier: Paris, 1970.

[8] Stevens R. W., Unix network programming. Prentice Hall: New Jersey, 1990.





Parameters | Novelty Network


1.0 


Reinforcement Network 

FRONTAL Network

10.0 

1.0 

5.0 

2.0 

1.0 

1.0 


1.0 


1.0 


3.0e-04 

10.00 


o 

liline j 


CV, 

0.05 

G>2 

0.5 


0.5 


1.0 


0.000 1 


0.005 


2.0 


0.005 


3.0 


3.0e-04 



0.0 

0.25 

0.0 

1.0 

0.85 

1.0 

1.0 


\V 


/ 


Arousal


H 


H , 



1.0 


0.005 

10.0 

10.0 

0.01 

0.01 

3.0 

5.0 

3.0 

3.0 

10.0 

10.0 

























































































